Getting More from RStudio

Michael Clark

June 13, 2016

Overview

Base R is an extremely powerful tool. However…

The syntax editor in Base R is a simple text editor and nothing more.

You should use something more efficient and easy to use.

Options:

  • RStudio
  • Emacs
  • Vim
  • Various others

Note, however, that RStudio itself offers Emacs and Vim editing modes.

Overview

RStudio offers:

  • Code completion and snippets
  • Code diagnostics
  • Customizable shortcuts
  • Document generation (web, pdf, presentation, .doc)
  • Web publishing

Overview

RStudio offers:

  • Enhanced debugging, profiling
  • Navigable data frames
  • Version control
  • Interactive visualization
  • Addins

Overview

RStudio is also an excellent tool for reproducible research.

  • Project management
  • Package building and freezing
  • Document generation

In short, you can go from data import to (modern, web-based) publication and replication.

Scripting

Everyone who uses RStudio benefits from easier scripting, including:

  • Syntax highlighting
  • Autocomplete of object/function names, etc.
  • Autopairing of parentheses, quotes, etc.
  • Auto indent

It even makes working at the console viable.

  • Still not advised

Keyboard shortcuts: standard scripts

Knowing a few shortcuts can save a lot of time in the long run.

Examples (Windows and Linux):

Run current line: Ctrl+Enter

Copy up/down: Ctrl+Shift+Up/Down

Move up/down: Shift+Up/Down

Run everything: Ctrl+Shift+Enter

Insert section: Ctrl+Shift+R

Select window: Ctrl+1:9

Show shortcut reference: Alt+Shift+K

Some favorite shortcuts

Multicursor: Ctrl+Alt+select

Move lines: (Shift+) Alt + arrow

Clear console: Ctrl + L

Restart R: Ctrl + Shift + F10

Scripting/Console window: Ctrl+1/Ctrl+2

Rerun previous: Ctrl + Shift + P

Run everything before: Ctrl + Alt + B

Run everything after: Ctrl + Alt + E

Knit: Ctrl + Shift + K

Keyboard shortcuts

The point is, knowing just a dozen shortcuts could save a lot of time.

  • Bonus: Ctrl+Shift+A (tidy up your code)

Mac users: most of these would use Cmd rather than Ctrl (but not always)

Snippets

Snippets allow one to insert code of a certain form for commonly used functions.

You only have to type the first few letters; the rest of the code template fills in, and you can then tab your way through its fields.
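As a rough illustration, RStudio's built-in `fun` snippet expands into a function skeleton with placeholders for the name, the arguments, and the body. The filled-in version below is only a sketch; `add_two` and `x` are made-up names for the example.

```r
# Typing `fun` and accepting the snippet yields a skeleton like:
#   name <- function(variables) {
#
#   }
# Tabbing through the placeholders and filling them in might give
# (add_two and x are hypothetical names chosen for this example):
add_two <- function(x) {
  x + 2
}

add_two(3)
```

Existing snippets can be viewed and edited from the Code section of Tools/Global Options.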

Code Diagnostics

RStudio will note problems in your code in the margin.

  • Examples: hanging brackets, too many commas etc.

This works beyond just R scripts too!

Customization

RStudio allows one to customize various aspects of how it looks and how you interact with it.

In addition, these customizations can apply to all RStudio sessions or only to a particular project you’re working on.

  • Tools/Global Options
  • Tools/Project Options

Customization

As a starting point, you may want to change how it looks.

  • Code look, indentation, highlighting etc.
  • Window pane locations

Customization

You may well want to change things such as the look, or whether the workspace is saved automatically, but the main point is simply to be aware of what you can change.

Projects

Projects provide a self-contained ecosystem within which to work.

  • Projects have their own working directory, workspace, etc.

If you have multiple projects, you can easily jump between them.

  • Without losing your place
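A minimal sketch of why a project's own working directory matters: paths can be written relative to the project root, so the same script runs unchanged on another machine. The folder name `my_project`, the `data` subfolder, and the file name are assumptions made for the example, and `setwd()` here only stands in for what RStudio does when a project is opened.

```r
# Hypothetical project folder, created in a temporary location for the demo
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Opening a project in RStudio sets the working directory to the project root;
# setwd() imitates that here and returns the previous directory
old_wd <- setwd(project_dir)

# Paths are relative to the project root, not to a user-specific location
write.csv(head(mtcars), file.path("data", "cars.csv"), row.names = FALSE)
cars_subset <- read.csv(file.path("data", "cars.csv"))

# Restore the original working directory
setwd(old_wd)
```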

Projects

File/New Project

Usually you’ll select new, but you’ll want to note the other options.

  • We’ll talk about version control later.

Projects

All tabs opened will remain open when you revisit the project.

You can have multiple projects running at the same time

  • i.e. multiple RStudio instances

Projects can be seen as a first step in getting more organized, more reproducible etc.

Rmarkdown

What is R?

R is fast becoming a general programming environment rather than just a statistical one.

Markdown is a language that allows for easier web-based documentation.

  • Not necessary to know html

Now one can intermingle R with markdown, html, css, JavaScript, \(\LaTeX\), and others, resulting in a variety of products.

Rmarkdown

  • html, pdf, doc
  • presentations (like this one)
  • dashboards
  • notebooks
  • websites
  • other publication

Example

File/New/R Markdown…

Or many of the others too.

R chunks are interspersed throughout the Rmd file, combining code, plain text, markdown and possibly others.
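To make the structure concrete, the sketch below writes a tiny Rmd file from R and, if the rmarkdown package and pandoc happen to be available, renders it. The file name and its contents are invented for the example; in practice you would create the file through the menu and edit it directly.

```r
# Lines of a minimal, hypothetical R Markdown document:
# a YAML header, some markdown text, and one R chunk
rmd_lines <- c(
  "---",
  "title: \"A small example\"",
  "output: html_document",
  "---",
  "",
  "Some *markdown* text, followed by an R chunk.",
  "",
  "```{r}",
  "summary(cars$speed)",
  "```"
)

rmd_file <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_file)

# Rendering is what the Knit button does; only attempt it when the tools exist
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  rmarkdown::render(rmd_file, quiet = TRUE)
}
```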

Rmarkdown

Once your file is ready, knit the document into the format you want.

  • Ctrl/Cmd+Shift+K

An Rmarkdown workshop will be given in the future for more details.

Interactive and Visual Data Exploration

The Viewer

In addition to the “Plots” pane, RStudio also provides a “Viewer” pane.

Anything interactive will be displayed there.

Packages

ggplot2 is the most widely used package for visualization in R.

However, it is not interactive by default.

Many packages use htmlwidgets and d3 (JavaScript library) to provide interactive graphics.

Packages

Some packages to note:

  • plotly
    • also available for Python, Matlab, and Julia; aside from its many interactive plot types, it can convert ggplot2 graphics to interactive ones.
  • ggvis
    • interactive successor to ggplot2, though not currently under active development
  • rbokeh
    • like plotly, also has cross-program support
  • DT
    • interactive data tables
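As a brief sketch of two of these packages, the code below converts a static ggplot2 plot to an interactive one with plotly and builds an interactive table with DT. It assumes the packages are installed and quietly does nothing otherwise; the choice of `mtcars` and of the plotted variables is arbitrary.

```r
# Convert a static ggplot2 plot into an interactive plotly widget
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("plotly", quietly = TRUE)) {
  static_plot <- ggplot2::ggplot(mtcars, ggplot2::aes(x = wt, y = mpg)) +
    ggplot2::geom_point()
  interactive_plot <- plotly::ggplotly(static_plot)
  interactive_plot  # shows up in the Viewer pane when run in RStudio
}

# An interactive, sortable, searchable table of the same data
if (requireNamespace("DT", quietly = TRUE)) {
  interactive_table <- DT::datatable(mtcars)
  interactive_table
}
```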

Example

Works in your presentations too.

Shiny

Shiny is a framework that can essentially allow you to build an interactive website.

Most of the more recently developed* packages will work specifically within the shiny and rmarkdown settings.

*R has a long history of providing interactive graphics, but most of it was very poor.
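For a sense of what a Shiny app involves, here is a minimal sketch with one input and one plot. The input name `n`, the output name `scatter`, and the random data are all invented for the example, and the app is only launched in an interactive session.

```r
if (requireNamespace("shiny", quietly = TRUE)) {
  library(shiny)

  # User interface: a slider and a placeholder for a plot
  ui <- fluidPage(
    sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
    plotOutput("scatter")
  )

  # Server logic: redraw the plot whenever the slider changes
  server <- function(input, output) {
    output$scatter <- renderPlot({
      plot(rnorm(input$n), rnorm(input$n), xlab = "x", ylab = "y")
    })
  }

  app <- shinyApp(ui = ui, server = server)

  # Launch only when someone is there to interact with it
  if (interactive()) runApp(app)
}
```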

Quick Wrap

RStudio lets you take a deeper look at your data with easy interactivity.

Interactive tables and plots go a long way to helping you understand your data better.

Addins

RStudio allows its users to create functions that can be used within RStudio with a click or keystroke.

These special functions are called addins.

Addins are a great way to increase your productivity and efficiency when scripting.

They can be anything, but the easiest (and perhaps most useful) example is text insertion/formatting.

  • Saves a lot of time in document creation

Creating Addins

Addins are nothing more than R functions that you can call interactively.

likeR()
I like R!
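The definition of `likeR()` is not shown here, so the following is only a guess at what such a text-insertion addin might look like. It assumes the rstudioapi package for inserting text at the cursor and falls back to printing a message when run outside RStudio.

```r
# Hypothetical addin function: inserts a fixed string at the cursor position
likeR <- function() {
  text <- "I like R!"
  if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
    # Inside RStudio: insert the text into the active document
    rstudioapi::insertText(text)
  } else {
    # Outside RStudio: just show the text
    message(text)
  }
  invisible(text)
}

likeR()
```

To appear in the Addins menu, a function like this would live in a package and be registered in that package's `inst/rstudio/addins.dcf` file.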

Example: ColorPicker

More Advanced

Debugging

Profiling

What is Debugging?

Debugging is merely finding and fixing problematic code.

  • Code will always have bugs

Debugging is an absolutely essential part of creating functions.

A note about functions

If you are doing anything more than twice, you should write a function instead.

  • It’s more generalizable
  • It’s more reproducible
  • It’s more efficient
  • Debugging can allow one to spot issues

  • RStudio can even help you get started transforming existing code to a function

    • Ctrl+Alt+X (Cmd+Option+X on Mac) on highlighted code
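A small sketch of the idea, using an invented `standardize()` helper: rather than copying the same centering and scaling code for each variable, the logic is written once and applied wherever it is needed.

```r
# Repeated by hand, this would be one near-identical line per variable:
#   mpg_z <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)
#   wt_z  <- (mtcars$wt  - mean(mtcars$wt))  / sd(mtcars$wt)

# Written once as a function (standardize is a hypothetical name)
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Applied to as many columns as needed
scaled <- lapply(mtcars[c("mpg", "wt", "hp")], standardize)
str(scaled)
```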

Debugging in RStudio

There are numerous facilities within R to help you debug your code.

  • Break Points

  • browser

  • debug

  • debugonce

  • traceback

RStudio makes the process pretty easy.
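A minimal sketch of these tools on a made-up function, `safe_ratio()`. The interactive debugger is only entered when the code is run in an interactive session, so the example also runs cleanly as a script.

```r
# Hypothetical function with a condition worth inspecting
safe_ratio <- function(x, y) {
  if (any(y == 0)) stop("division by zero in 'y'")
  x / y
}

if (interactive()) {
  # Step through the next call only; debug(safe_ratio) would do so on every call
  debugonce(safe_ratio)
  safe_ratio(1:3, c(1, 2, 4))
  # After an uncaught error at the console, traceback() shows the call stack
}

# Non-interactively, capture the error message instead of stopping
result <- tryCatch(safe_ratio(1, 0), error = function(e) conditionMessage(e))
result
```

Placing `browser()` inside the function body has a similar effect to a breakpoint: execution pauses at that line and the debug mode commands become available.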

Debug Mode Commands

There are commands that allow you to work through debugging:

  • Next (n)/Return: runs the next line

  • Step into (s): if the next line is a new function, it enters into the function

    • Careful with this one; you can get pretty far into other functions

  • Finish (f): finishes the function

  • Continue (c): stops debugging and runs the function

  • Stop (Q): stops debugging and does not run the function

Debug Mode Commands

Each of these also has a button in the debugging menu

Profiling

Code profiling allows one to see which parts of the code take the most processing time and resources (memory).

Like debugging, there have always been tools in base R for this, but RStudio makes it easy to profile any code.

Furthermore, it doesn’t have to be an explicit function.
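As a sketch of the base R tooling, the code below profiles a deliberately slow, made-up function with `Rprof()`. The function name and the loop size are arbitrary, and the summary step is wrapped in `tryCatch()` because a very fast run may produce no profiling samples.

```r
# Hypothetical slow function: an explicit loop instead of vectorized code
slow_mean_sqrt <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total / n
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start sampling the call stack
value <- slow_mean_sqrt(2e6)
Rprof(NULL)                        # stop profiling

# Time spent per function; NULL if no samples were collected
prof_summary <- tryCatch(summaryRprof(prof_file)$by.self, error = function(e) NULL)
prof_summary

# In RStudio, wrapping the same call in profvis::profvis({ ... })
# gives the interactive view (assuming the profvis package is installed)
```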

Quick Wrap

  • Debugging and profiling are important parts of advanced programming.
  • Regardless of expertise, one should desire to make code as general and reproducible as possible.
  • RStudio makes the process more interactive and flexible.

Package Development

RStudio makes package development easier too.

  • New Project > New Directory > R Package

R Package Dialog Box

“Create package based on source files:” allows you to include previously written functions in your new package.

When the package gets created, each function you added at this step gets its own help file.

  • You will still need to complete the help files, but at least they are there.

What You Get

RStudio will automatically start you out with the following:

  • DESCRIPTION: Just like every R package

  • A ‘man’ folder: Contains .Rd files for each function

  • An ‘R’ folder: Contains your functions.

The roxygen2 package helps to properly format your documentation files.
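A short sketch of what roxygen2-style documentation looks like: specially marked comments sit directly above the function in the 'R' folder, and roxygen2 turns them into the corresponding .Rd file in 'man'. The function `rescale01()` is invented for the example.

```r
#' Rescale a numeric vector to the range 0 to 1
#'
#' @param x A numeric vector.
#' @return A numeric vector of the same length as `x`, with minimum 0 and maximum 1.
#' @export
#' @examples
#' rescale01(c(1, 2, 3))
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

rescale01(c(1, 2, 3))
```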

Build, Reload, Check

After you have all of your files ready, you can build the package.

Packages tend to have a lot happening in them.

To help you make sure that the package has everything it needs, you can run the check function from devtools on it.

It will check package quality across many dimensions:

  • Ability to install package and its dependencies

  • Checking help file quality

  • Find errors in examples
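The same steps can also be run from the console. The wrapper below is only a sketch: `check_package()` is a made-up helper, the path in the usage comment is a placeholder, and it assumes the devtools package is installed.

```r
# Hypothetical helper: regenerate documentation, then check the package
check_package <- function(pkg_path = ".") {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("The devtools package is required for this sketch.")
  }
  devtools::document(pkg_path)  # rebuild man/*.Rd from roxygen2 comments
  devtools::check(pkg_path)     # run the package checks
}

# Usage, with a placeholder path to a package's source directory:
# check_package("path/to/mypackage")
```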

Quick Wrap

RStudio has built-in tools that make package creation a straightforward process.

You should not be afraid to create your own packages, even for just personal use

  • Within a project: greater reproducibility

Version Control

Overview

RStudio offers the ability to integrate version control into your project.

  • Subversion
  • Git
  • Both are free and open (we’ll focus on Git)

Wait, Wait! What Is Version Control?

At its most basic, it is just a way to manage changes.

  • Documents, code, etc.

Especially useful when collaborating.

  • Keep track of who is making changes and what they are changing
  • Revert changes back to an earlier version
  • Merging multiple copies of a document into one
  • Branched development

Git

Git works on a distributed model

  • Users create their own local repositories

Tools that use Git to share code on the web

  • Bitbucket
  • GitLab
  • GitHub

Public vs. private repositories

GitHub

GitHub is a web-based hosting service that allows you to upload your Git repository.

Essentially a social network for software developers and others.

Process (Briefly)

  • Commit changes made (including file creation)
  • Push up to repository
  • Pull from repository changes others have made

GUI vs. the Shell

RStudio makes it easy to commit, push, pull, revert, check diffs etc.

If you need other things, you can access the Git shell directly

Quick Wrap

RStudio makes it easy to integrate version control into your project.

You have nothing to lose by keeping track of files and the changes that have been made to them.

This is especially useful when collaborating.

Cheat Sheets

Cheat Sheets - RStudio Style

RStudio wants everything to be easy for us as R users.

As such, they have produced a series of cheat sheets as reference material.

https://www.rstudio.com/resources/cheatsheets/

RStudio

RStudio even has a cheatsheet for using RStudio!

It provides a high-level overview for many of the things we are talking about here.

It also has a comprehensive list of keyboard shortcuts.

  • Alt + Shift + K will bring them up in RStudio.
  • Shortcuts can save you a lot of time.
  • Do show care in your keystrokes… otherwise, you might find your screen rotated or your keyboard producing Hebrew characters.

Data Visualization

It is essentially a primer on using ggplot2.

It effectively communicates the various geoms.

For the beginning ggplot2 user, the following sections are indispensable:

  • Scales
  • Coordinate Systems
  • Faceting
  • Position Adjustments

Data Wrangling

Data wrangling is essentially just a fun way of saying data cleaning and prep.

The cheat sheet offers some useful tips on using two handy packages:

  • dplyr

    • Handles all manner of data subsetting, filtering, variable selection, grouping, summarizing, etc.
  • tidyr

    • Used for reshaping data (wide to long, long to wide).
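A brief sketch of the dplyr style of data wrangling, assuming the package is installed: filter rows, group, and summarize in one pipeline. The data set and the cutoff of 100 horsepower are arbitrary choices for the example.

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  suppressPackageStartupMessages(library(dplyr))

  # Mean fuel economy by cylinder count, for cars with more than 100 hp
  mpg_by_cyl <- mtcars %>%
    filter(hp > 100) %>%
    group_by(cyl) %>%
    summarise(mean_mpg = mean(mpg), n_cars = n())

  mpg_by_cyl
}
```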

R Markdown

R Markdown is used to generate reproducible documents with R.

Your document can contain code, data, analyses, visualizations, or anything else that you want to include.

You may also include html, css, javascript, and \(\LaTeX\) in your documents.

R Markdown documents can be saved as html, pdf, or even Word documents.

R Markdown Reference Guide

Using rmarkdown is in part a combination of three different things:

markdown

  • The basic structure of the document (headings, sections, text)

knitr

  • Controls how R is used within the document

pandoc

  • Controls the output (html, pdf; document, presentation)

Package Development

RStudio makes package development accessible to anyone.

It has many capabilities for helping you to create packages:

  • automatic file creation with roxygen2

The cheat sheet details using devtools.

  • devtools was created specifically for package development

Shiny

A Shiny app is a web page that allows users to interact with an R session.

  • Users can interact with the data, models, visualizations, etc.

Quick Wrap

RStudio wants to make things easy on you!

Having a handy copy of the cheat sheets will serve you well!